REMARKS

This amendment is responsive to the Office Action dated February 26, 2004. Applicant 
has amended claims 1, 16, 22, and 26 and canceled claims 5, 13-15, 23-25, and 30. Applicant
has also added claims 32-39. Claims 1-4, 6-12, 16-22, 26-29, and 31-39 are now pending. 

Claim Rejection Under 35 U.S.C. § 102 

In the Office Action, the Examiner rejected claims 1-10 and 12-29 under 35 U.S.C. 
§ 102(a) as being anticipated by Haritaoglu (Scene Text Extraction and Translation for Handheld 
Devices). 

With this Response, Applicants have submitted a Declaration Under 37 C.F.R. § 1.131. 
The Declaration, and accompanying Exhibit, establish that Applicant conceived the inventions 
set forth in claims 1-10 and 12-39 of this application prior to the date of the Haritaoglu reference, 
i.e., December 8, 2001, and worked on the filing of a patent application with due diligence from 
a time prior to the date of the Haritaoglu reference to the filing date of this application, i.e., 
December 21, 2001. On the basis of the Declaration, Applicant submits the claimed invention
was clearly conceived prior to the date of the Haritaoglu reference, and diligently reduced to
practice by way of a constructive reduction through the filing of this application.

Applicants do not acquiesce in the Examiner's rejection under section 102, nor any 
characterization of the scope and content of the Haritaoglu reference. In view of the Declaration 
and Exhibits, however, Applicants respectfully submit that Haritaoglu does not qualify as prior 
art, and therefore request that the rejections of claims 1-10 and 12-29 on the basis of the 
Haritaoglu reference, be withdrawn. 

Claim Rejection Under 35 U.S.C. § 103 

In the Office Action, the Examiner rejected claims 11, 30 and 31 under 35 U.S.C.
§ 103(a) as being unpatentable over Chong et al. (US 5,535,120). Applicant respectfully 
traverses the rejection. 

Applicant notes at the outset that the Examiner's apparent intent was to reject claims 11,
30 and 31 under 35 U.S.C. § 103(a) as being unpatentable over Haritaoglu in view of Chong. In
the Examiner's detailed comments under 35 U.S.C. § 103(a), the Examiner noted that Haritaoglu
does not teach providing identification of the first and second language and the dictionary to use.
The Examiner then goes on to explain that Chong teaches such features. Therefore, it appears
that the Examiner meant to reject claims 11 and 30-31 of the Office Action under 35 U.S.C.
§ 103(a) as being unpatentable over Haritaoglu in view of Chong.

In view of the Declaration and Exhibit submitted with this Amendment, Applicants 
respectfully submit Haritaoglu does not qualify as prior art. Accordingly, the rejection of claims 
11, 30 and 31, to the extent it relies on the Haritaoglu reference, should now be withdrawn.

Applicants do not acquiesce to or admit in any way to the propriety of the rejections 
advanced by the Examiner under sections 102 and 103 with respect to claims 1-10 and 12-31. 
On the contrary, such claims recite a number of features that are neither disclosed nor suggested 
by the applied references. The Declaration should render moot such rejections, however, and 
expedite allowance of the pending claims. 

Other art previously relied on by the Examiner, particularly Chong et al. (US 5,535,120) 
and Yamauchi et al. (US 5,701,497), fail to disclose each and every feature of the claimed 
invention, as required by 35 U.S.C. §§ 102 and 103, and provide no teaching that would have 
suggested the desirability of modification to include such features. Claim 1 as amended, for 
example, recites establishing a wireless connection, transmitting an image containing text in a 
first language over the network via the wireless connection and receiving a translation of the text 
in a second language over the network via the wireless connection. Similarly, claim 16 as 
amended recites transmitting an image and receiving a translation of the image over a network 
via a wireless connection. In addition, claim 26 recites a client device that transmits and image 
over the network to a remote server via a wireless connection and receives a translation from the 
remote server via the wireless connection. Neither Chong nor Yamauchi discusses wireless 
connections at all and, particularly establishing a wireless connection, transmitting an image 
containing text in a first language over the network via the wireless connection and receiving a 
translation of the text in a second language over the network via the wireless connection. 

Claim 28 as amended recites capturing a first image containing text with an image 
capture device and generating from the first image a second image containing text in response to 
a command from a user. Neither Chong nor Yamauchi discloses any editing capability or the
desirability of the same.

For at least these reasons, the Examiner has failed to establish a prima facie case for
non-patentability of Applicant's claims 1-10 and 12-29 under 35 U.S.C. § 102(b) and claims 11 and
31 under 35 U.S.C. § 103(a). Withdrawal of these rejections is requested.

New Claims: 

Applicant has added claims 32-39 to the pending application. The applied references fail 
to disclose or suggest the inventions defined by Applicant's new claims, and provide no teaching 
that would have suggested the desirability of modification to arrive at the claimed inventions. 

As one example, the references fail to disclose or suggest transmitting an image 
containing text in a first language over a network, receiving a translation of the text in a second 
language over the network and displaying the image and the translation simultaneously, as 
recited by claim 37. 

As another example, the references fail to disclose or suggest a device comprising an 
image capture apparatus that obtains an image containing text of a first language, a controller that 
edits the image in response to commands of a user, a transmitter that transmits the edited image
over a network and a receiver that receives a translation of the text in a second language over the 
network. 

No new matter has been added by the new claims.

CONCLUSION

All claims in this application are in condition for allowance. Applicant respectfully 
requests reconsideration and prompt allowance of all pending claims. Please charge any 
additional fees or credit any overpayment to deposit account number 50-1778. The Examiner
is invited to telephone the below-signed attorney to discuss this application.



Date: By: 

SHUMAKER & SIEFFERT, P.A. Name: Daniel J. Hanson 

8425 Seasons Parkway, Suite 105 Reg. No.: 46,757 

St. Paul, Minnesota 55125 
Telephone: 651.735.1100 
Facsimile: 651.735.1102 






Docket No.: 1011-001US01 



NETWORK-BASED TRANSLATION SYSTEM 

TECHNICAL FIELD 

The invention relates to electronic communication, and more particularly, to 
electronic communication with language translation. 

BACKGROUND 

The need for real-time language translation has become increasingly important. It is 
becoming more common for a person to encounter foreign language text. Trade with a 
foreign company, cooperation of forces in a multi-national military operation in a foreign 
land, emigration and tourism are just some examples of situations that bring people in contact 
with languages with which they may be unfamiliar. 

In some circumstances, the written language barrier presents a very difficult problem. 
An inability to understand directional signs, street signs or building name plates may result in 
a person becoming lost. An inability to understand posted prohibitions or danger warnings 
may result in a person engaging in illegal or hazardous conduct. An inability to understand 
advertisements, subway maps and restaurant menus can result in frustration. 

Furthermore, some written languages are structured in a way that makes it difficult to 
look up the meaning of a written word. Chinese, for example, does not include an alphabet, 
and written Chinese includes thousands of picture-like characters that correspond to words 
and concepts. An English-speaking traveler encountering Chinese language text may find it 
difficult to find the meaning of a particular character, even if the traveler owns a Chinese- 
English dictionary. 

SUMMARY 

In general, the invention provides techniques for translation of written languages. A
user captures the text of interest with a client device, which may be a handheld computer, for
example, or a personal digital assistant (PDA). The client device interacts with a remote
server to obtain a translation of the text. The user may use an image capture device, such as a
digital camera, to capture the text. The digital camera may be integrated with or coupled to the
client device.

In many cases, an image captured in this way includes not only the text of interest,
but extraneous matter. The invention provides techniques for editing the image to retain the
text of interest and excise the extraneous matter. One way for the user to edit the image is to
display the image on a PDA and circle the text of interest with a stylus. When the image is
edited, the user may translate the text in the image right away, or save the image for later
translation.

To obtain a translation of the text in one or more images, the user commands the
client device to obtain a translation. The client device establishes a communication
connection with a remote server over a network, and transmits the images in a compressed
format to the server. The server extracts the text from the images using optical character
recognition software, and translates the text with a translation program. The server transmits
the translations back to the client device. The client device may display an image of text and
the corresponding translation simultaneously. The client device may further display other
images and corresponding translations in response to commands from the user.

In one embodiment, the invention presents a method comprising transmitting an
image containing text in a first language over a network, and receiving a translation of the
text in a second language over the network. The image may be captured with an image
capture device and edited prior to transmission. After the translation is received, the image
and the translation may be displayed simultaneously.

In another embodiment, the invention is directed to a method comprising receiving an
image containing text in a first language over a network, translating the text to a second
language and transmitting the translation over the network. The method may further include
extracting the text from the image with optical character recognition.

In another embodiment, the invention is directed to a client device comprising an image
capture apparatus that receives an image containing text in a first language, a transmitter
that transmits the image over a network and a receiver that receives a translation of the text in
a second language over the network. The device may also include a display that displays the
translation and the image. The device may further comprise a controller that edits the image
in response to the commands of a user. In some implementations, the device may include an
image capture device, such as a digital camera, or a cellular telephone that establishes a
communication link between the device and the network.

In a further embodiment, the invention is directed to a server device comprising a
receiver that receives an image containing text in a first language over a network, a translator
that generates a translation of the text in a second language and a transmitter that transmits
the translation over the network. The device may also include a controller that selects which
of many translators to use and an optical character recognition module that extracts the text
from the image.

The invention offers several advantages. The client device and the server cooperate to
use the features of modern, fully-featured translation programs. When the client device is
wirelessly coupled to the network, the user is allowed expanded mobility without sacrificing
performance. The client device may be configured to work with any language and need not
be customized to any particular language. Indeed, the client device processes image-based
text, leaving the recognition and translation functions to the remote server. Furthermore, the
invention is especially advantageous when the language is so unfamiliar that it would not be
possible for a user to look up words in a dictionary.

The invention also supports editing of image data prior to transmission to remove 
extraneous data, thereby saving communication time and bandwidth. The invention can save 
more time and bandwidth by transmitting several images for translation at one time. 

The user interface offers several advantages as well. In some embodiments, the user
can easily edit the image to remove extraneous material. The user interface also supports
display of one or more images and the corresponding translations. Simultaneous display of
an image of text and the corresponding translation lets the user associate the text with the
meaning that the text conveys.

The details of one or more embodiments of the invention are set forth in the
accompanying drawings and the description below. Other features, objects, and advantages
of the invention will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF DRAWINGS 

FIG. 1 is a diagram illustrating an embodiment of a network-based translation system.
FIG. 2 is a functional block diagram illustrating an embodiment of a network-based
translation system.
FIG. 3 is an exemplary user interface illustrating image capture and editing.
FIG. 4 is an exemplary user interface further illustrating image capture and editing,
and illustrating commencement of interaction between client and server.
FIG. 5 is an exemplary user interface illustrating a translation display.
FIG. 6 is a flow diagram illustrating client-server interaction.



DETAILED DESCRIPTION 

FIG. 1 is a diagram illustrating an image translation system 10 that may be employed
by a user. System 10 comprises a client side 12 and server side 14, separated from each other
by communications network 16. System 10 receives input in the form of images of text. The
images of text may be obtained from any number of sources, such as a sign 18. Other
sources of text may include building name plates, advertisements, maps and printed
documents.

In one embodiment, system 10 receives text image input with an image capture
device such as a camera 20. Camera 20 may be, for example, a digital camera, such as a
digital still camera or a digital motion picture camera that can capture a moving image and
generate a still image. The user directs camera 20 at the text the user desires to translate, and
captures the text in a still image. The image may be displayed on a client device such as a
display device 22 coupled to camera 20. Display device 22 may comprise, for example, a
hand-held computer or a personal digital assistant (PDA).

Often, a captured image includes the text that the user desires to translate, along with
extraneous material. A user who has captured the text on a public marker, for example, may
capture the main caption and the explanatory text, but the user may be interested only in the
main caption of the marker. Accordingly, display device 22 may include a tool for editing the
captured image to isolate the text of interest. An editing tool may include a
cursor-positionable selection box or a selection tool such as a stylus 24. The user selects the desired
text by, for example, lassoing or drawing a box around the desired text with the editing tool.
The desired text is then displayed on display device 22.

When the user desires to translate the text, the user selects the option that begins
translation. Display device 22 compresses the image for transmission. Display device 22
may compress the image as a JPEG file, for example. Display device 22 may further include
a modem or other encoding/decoding device to encode the compressed image for
transmission.

Display device 22 may be coupled to a communication device such as a cellular
telephone 26. Alternatively, display device 22 may include an integrated wireless
transceiver. The compressed image is transmitted via cellular telephone 26 to remote server
28 via network 16. Network 16 may include, for example, a cellular telephone network, the
public switched telephone network, an integrated services digital network, a satellite network
or the Internet, or any combination thereof.

Server 28 receives the compressed image that includes the text of interest. Server 28
decodes the compressed image to recover the image, and retrieves the text from the image
using any of a variety of commercially available optical character recognition (OCR)
techniques. [CAN WE LIST ONE OR MORE EXAMPLES? IS IRIS AN EXAMPLE OF
OCR-BASED SOFTWARE THAT RECOGNIZES FOREIGN TEXT? FROM WHICH
COMPANY/COMPANIES IS OCR SOFTWARE COMMERCIALLY AVAILABLE?] After
retrieving the text, server 28 translates the recognized characters using any of a variety of
commercially available translation programs. [CAN WE LIST ONE OR MORE
EXAMPLES OF TRANSLATION PROGRAMS AND THE COMPANY/COMPANIES
FROM WHICH THEY ARE COMMERCIALLY AVAILABLE?] Server 28 transmits the
translation to cellular telephone 26 via network 16, and cellular telephone 26 relays the
translation to display device 22.

Display device 22 displays the translation. For the convenience of the user, display
device 22 may simultaneously display, in thumbnail or full-size format, the image that
includes the translated text. The displayed image may be the image retained by display
device 22, rather than an image received from server 28. In other words, server 28 may
transmit the translation unaccompanied by any image data. Because the image data may be
retained by display device 22, there is no need for server 28 to transmit any image data back
to the user, conserving communication bandwidth and resources.

System 10 depicted in FIG. 1 is exemplary, and the invention is not limited to the
particular system shown. The invention encompasses components coupled wirelessly as well
as components coupled by hard wire. Camera 20 represents one of many devices that capture
an image, and the invention is not limited to use of any particular image capture device.
Furthermore, cellular telephone 26 represents one of many devices that can provide an
interface to communications network 16, and the invention is not limited to use of a cellular
telephone.

Furthermore, the functions of display device 22, camera 20 and/or cellular telephone
26 may be combined in a single device. A cellular telephone, for example, may include the
functionality of a PDA, or a handheld computer may include a built-in camera and a built-in
cellular telephone. The invention encompasses all of these variations.

FIG. 2 is a functional block diagram of an embodiment of the invention. On client
side 12, the user interacts with client device 30 through an input/output interface 32. In a
client device such as a PDA, the user may interact with client device 30 via input/output
devices such as a display 34 or stylus 24. Display 34 may take the form of a touchscreen.
The user may also interact with client device 30 via other input/output devices, such as a
keyboard, mouse, touch pad, push buttons or audio input/output devices.

The user further interacts with client device 30 via image capture device 36 such as
camera 20 shown in FIG. 1. With image capture device 36, the user captures an image that
includes the text that the user wants to translate. Image capture hardware 38 is the apparatus
in client device 30 that receives image data from image capture device 36.

Client translator controller 40 displays the captured image on display 34. The user
may edit the captured image using an editing tool such as stylus 24. In some circumstances,
an image may include text that the user wants to translate and extraneous information. The
user may edit the captured image to preserve the text of interest and to remove extraneous
material. The user may also edit the captured image to adjust factors such as the size of the
image, contrast or brightness. Client translator controller 40 edits the image in response to the
commands of the user and displays the edited image on display 34. Client translator
controller 40 may receive and edit several images, displaying the images in response to the
commands of the user.

In response to a command from the user to translate the text in one or more of the
images, client translator controller 40 establishes a connection with network 16 and server 28
via transmitter/receiver 42. Transmitter/receiver 42 may include an encoder that compresses
the images for transmission. Transmitter/receiver 42 transmits the image data to server 28
via network 16. Client translator controller 40 may include data in addition to image data in
the transmission, such as an identification of the source language as specified by the user.

Server 28 includes a transmitter/receiver 44 that receives and decodes the image
data. A server translator controller 46 receives the decoded image data and controls the
translation process. An optical character recognition module 48 receives the image data and
recovers the characters from the image data. The recovered data are supplied to translator 50
for translation. In some servers, recognition and translation may be combined in a single
module. Translator 50 supplies the translation to server translator controller 46, which
transmits the translation to client device 30 via transmitter/receiver 44 and network 16.
Client device 30 receives the translation and displays the translation on display 34.

Server 28 may include several optical character recognition modules and translators.
Server 28 may include separate optical character recognition modules and translators for
Japanese, Arabic and Russian, for example. Server translator controller 46 selects which
optical character recognition module and translator are appropriate, based upon the source
language specified by the user.
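
By way of illustration only, the selection performed by server translator controller 46 might be sketched in Python as a lookup keyed on the source language. The class and function names below (`LanguagePipeline`, `ServerTranslatorController`, `make_stub_pipeline`) are hypothetical and do not appear in the specification, and the recognition and translation functions are stubs standing in for whatever optical character recognition modules and translators a given server provides.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class LanguagePipeline:
    """Pairs one OCR module with one translator (both hypothetical stubs here)."""
    recognize: Callable[[bytes], str]  # image data -> recovered characters
    translate: Callable[[str], str]    # recovered characters -> translation


class ServerTranslatorController:
    """Selects the OCR module and translator for a user-specified source language."""

    def __init__(self) -> None:
        self._pipelines: Dict[str, LanguagePipeline] = {}

    def register(self, source_language: str, pipeline: LanguagePipeline) -> None:
        self._pipelines[source_language.lower()] = pipeline

    def handle(self, image: bytes, source_language: str) -> str:
        key = source_language.lower()
        if key not in self._pipelines:
            raise ValueError(f"no modules registered for {source_language!r}")
        pipeline = self._pipelines[key]
        return pipeline.translate(pipeline.recognize(image))


def make_stub_pipeline(tag: str) -> LanguagePipeline:
    # Stub behavior only: a real module would perform recognition and translation.
    return LanguagePipeline(
        recognize=lambda image: f"{tag}-text({len(image)} bytes)",
        translate=lambda text: f"translated[{text}]",
    )


controller = ServerTranslatorController()
for language in ("japanese", "arabic", "russian"):
    controller.register(language, make_stub_pipeline(language))
```

In this sketch an unregistered source language raises an error; a server could equally fall back to automatic language recognition, which the specification mentions only as a possible modification.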

FIG. 3 is an exemplary user interface on client device 30, such as display device 22,
following capture of an image 60. Image 60 includes text of interest 62 and other extraneous
material 64, such as other text, a picture of a sign, and the environment around the sign. The
extraneous material is not of immediate interest to the user, and may delay or interfere with
the translation of text of interest 62. The user may edit image 60 to isolate text of interest 62
by, for example, tracing a loop 66 around text of interest 62. Client device 30 edits the image
to show the selected text 62.
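
One simple way to realize such an edit, offered here only as a sketch, is to crop the image to the bounding box of the traced loop. The specification does not prescribe how the loop is converted into an edited image, so the bounding-box approach, the pixel-grid representation and the names `bounding_box` and `crop_to_loop` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]   # (column, row) of a point on the traced loop
Image = List[List[int]]   # rows of grayscale pixel values


def bounding_box(loop: Sequence[Point]) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the smallest box enclosing the loop."""
    if not loop:
        raise ValueError("loop must contain at least one point")
    xs = [x for x, _ in loop]
    ys = [y for _, y in loop]
    return min(xs), min(ys), max(xs), max(ys)


def crop_to_loop(image: Image, loop: Sequence[Point]) -> Image:
    """Keep only the pixels inside the loop's bounding box, clamped to the image."""
    left, top, right, bottom = bounding_box(loop)
    height = len(image)
    width = len(image[0]) if image else 0
    left, top = max(left, 0), max(top, 0)
    right, bottom = min(right, width - 1), min(bottom, height - 1)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A 5-row by 6-column test image whose pixel value encodes its position.
image = [[10 * r + c for c in range(6)] for r in range(5)]
loop = [(1, 1), (3, 1), (4, 2), (2, 3), (1, 2)]
edited = crop_to_loop(image, loop)
```

Cropping to a rectangle discards the extraneous material outside the box while keeping the edited image easy to compress; masking pixels outside the exact loop would be an alternative.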

FIG. 4 is an exemplary user interface on client device 30 following editing of image
60. Edited image 70 includes text of interest 62, without the extraneous material. Edited
image 70 may also include an enlarged version of text of interest 62, and may have altered
contrast or brightness to improve readability.

Client device 30 may provide the user with one or more options in regard to text of
interest 62. FIG. 4 shows two exemplary options, which may be selected with stylus 24. One
option 72 adds selected text 62 to a list of other images including other text of interest. In
other words, the user may store a plurality of text-containing images for translation, and may
have any or all of them translated when a connection to server 28 is established.

Another option is a translation option 74, which instructs client device 30 to begin the
translation process. Upon selection of translation option 74, client device 30 may present the
user with a menu of options. For example, if several text-containing images have been stored
in the list, client device 30 may prompt the user to specify which of the images are to be
translated. Client device 30 may further prompt the user to provide additional information,
such as specifying the source language, i.e., the language of the text to be translated, and the
target language, i.e., the language with which the user is more familiar. [IS THERE ANY
OTHER DATA FOR WHICH THE CLIENT DEVICE MAY PROMPT THE USER?]

When the user gives the instruction to translate, client device 30 establishes a
connection to server 28 via transmitter/receiver 42 and network 16. Server 28 performs the
optical character recognition and the translation, and sends the translation back to client
device 30. Client device 30 may notify the user that the translation is complete with a cue
such as a visual prompt or an audio announcement.
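
The stored list and the translation request described above might be modeled, purely as a sketch, as follows. The names `StoredImage`, `TranslationRequest` and `PendingList` are hypothetical, and the request fields are limited to the items the description mentions: the selected images, the source language and the target language.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class StoredImage:
    name: str
    data: bytes  # edited, text-containing image data


@dataclass(frozen=True)
class TranslationRequest:
    images: List[StoredImage]
    source_language: Optional[str]  # language of the text to be translated
    target_language: str            # language with which the user is familiar


class PendingList:
    """Holds edited images until the user asks for some or all to be translated."""

    def __init__(self) -> None:
        self._items: Dict[str, StoredImage] = {}

    def add(self, image: StoredImage) -> None:
        self._items[image.name] = image

    def names(self) -> List[str]:
        return list(self._items)

    def build_request(self, selected: Sequence[str],
                      source_language: Optional[str],
                      target_language: str) -> TranslationRequest:
        missing = [name for name in selected if name not in self._items]
        if missing:
            raise KeyError(f"not in list: {missing}")
        chosen = [self._items[name] for name in selected]
        return TranslationRequest(chosen, source_language, target_language)


pending = PendingList()
pending.add(StoredImage("street sign", b"\x01\x02"))
pending.add(StoredImage("menu", b"\x03"))
pending.add(StoredImage("subway map", b"\x04\x05\x06"))
request = pending.build_request(["menu", "street sign"], "japanese", "english")
```

Collecting the selected images into one request object reflects the batching idea: several images can travel to the server over a single connection.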

FIG. 5 is an exemplary user interface on client device 30 following translation. For
the convenience of the user, client device 30 may display a thumbnail view 80 of the image
that includes the translated text. Client device 30 may also display a translation of the text
82. Client device 30 may further provide other information 84 about the text, such as the
English spelling of the foreign words, phonetic information or alternate meanings. A scroll
bar 86 may also be provided, allowing the user to scroll through the list of images and their
respective translations. An index 88 may be displayed showing the number of images for
which translations have been obtained.

FIG. 6 is a flow diagram illustrating an embodiment of the invention. On client side
12, client device 30 captures an image (100) and edits the image (102) according to the
commands of the user. In response to the command of the user to translate the text in the
image, client device 30 encodes the image (104) and transmits the image (106) to server 28
via network 16.

On server side 14, server 28 receives the image (108) and decodes the image (110).
Server 28 extracts the text from the image with optical character recognition module 48 (112)
and translates the extracted text (114). Server 28 transmits the translation (116) to client
device 30. Client device 30 receives the translation (118) and displays the translation along
with the image (120).
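
The flow of FIG. 6 can be traced with a small, self-contained Python sketch. Everything here is a stand-in chosen for illustration: `zlib` compression takes the place of JPEG encoding, the "image" is simply UTF-8 text so that the recognition step can be stubbed, the two-word glossary replaces a real translation program, and a direct function call replaces the network. The comments map each function to the step numbers of FIG. 6.

```python
import zlib
from typing import Dict, Tuple

# Toy glossary standing in for a translation program (illustrative data only).
GLOSSARY: Dict[str, str] = {"salida": "exit", "peligro": "danger"}


def encode_image(image: bytes) -> bytes:
    """Step 104: compress the edited image for transmission."""
    return zlib.compress(image)


def decode_image(payload: bytes) -> bytes:
    """Step 110: recover the image from the compressed payload."""
    return zlib.decompress(payload)


def extract_text(image: bytes) -> str:
    """Step 112: stub for optical character recognition."""
    return image.decode("utf-8")


def translate_text(text: str) -> str:
    """Step 114: word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(GLOSSARY.get(word.lower(), word) for word in text.split())


def server_handle(payload: bytes) -> str:
    """Steps 108-116: the server returns the translation only, no image data."""
    return translate_text(extract_text(decode_image(payload)))


def client_translate(image: bytes) -> Tuple[bytes, str]:
    """Steps 104-106 and 118-120: send the image, pair the reply with the retained image."""
    payload = encode_image(image)
    translation = server_handle(payload)  # stands in for the network round trip
    return image, translation


retained_image, translation = client_translate("PELIGRO salida".encode("utf-8"))
```

Note that the client pairs the returned translation with the image it already holds, so the reply from the server carries no image data.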

The invention offers many advantages. By performing optical character recognition
and translation on server side 14, the user receives the benefit of the translation capability of
the server, such as the most advanced versions of optical character recognition software and
the most fully-featured translation programs. The user further has the benefit of
multi-language capability. A particular server may be able to recognize and translate several
languages, or the user may use network 16 to access any of a number of servers that can
recognize and translate different languages. The client device is therefore flexible and need
not be customized to any particular language.

The invention may be used with any source language, but is especially advantageous
for a user who wishes to translate written text in a completely unfamiliar written language.
An English-speaking user who sees a notice in Spanish, for example, can look up the words
in a dictionary because the English and Spanish alphabets are similar. An English-speaking
user who sees a notice in Japanese, Chinese, Arabic, Korean or Cyrillic, however, may not
know how to look up the words in a dictionary. The invention provides a fast and easy way to
obtain translations even when the written language is totally unfamiliar.

Furthermore, the communication between client side 12 and server side 14 is
efficient. Image data from client side 12 may be edited prior to transmission to remove
extraneous data. The edited image is usually compressed to further save communication time
and bandwidth. Translation data from server side 14 need not include images, which further
saves time and bandwidth. Conservation of time and bandwidth reduces the cost of
communicating between client device 30 and server 28. Client device 30 further reduces
costs by saving several images for translation, and transmitting the images in a batch to
server 28.

The user interface offers several advantages as well. The editing capability of client
device 30 lets the user edit the image directly. The user need not edit the image indirectly,
such as by adjusting the field of view of camera 20 until only the text of interest is captured.
The user interface is also advantageous in that the image is displayed with the translation,
allowing the user to compare the text that the user sees to the text shown on display 34.

Although the invention encompasses hard line and wireless connections of client 
device 30 to network 16, wireless connections are advantageous in many situations. A 
wireless connection allows travelers, such as tourists, to be more mobile, seeing sights and 
obtaining translations as desired. 

Including recognition and translation functionality on server side 14 also benefits 
travelers by saving weight and bulk on client side 12. The user need not carry any 
specialized equipment to accommodate the idiosyncrasies of any particular written language.
The equipment on the client side works with any written language. 

Several embodiments of the invention have been described. Various modifications 
may be made without departing from the scope of the invention. For example, server 28 may 
provide additional functionality such as recognizing the source language without a 
specification of a source language by the user. Server 28 may send back the translation in 
audio form, as well as in written form. 

Cellular phone 26 is shown in FIG. 1 as an interface to network 16. Although cellular 
phone 26 is not needed for an interface to every communications network, the invention can 
be implemented in a cellular telephone network. In other words, a cellular provider may 
provide visual language translation services in addition to voice communication services. 
These and other embodiments are within the scope of the following claims.

CLAIMS:

1. A method comprising:

transmitting an image containing text in a first language over a network; and 
receiving a translation of the text in a second language over the network. 

2. The method of claim 1, wherein the image is a second image, the method further 
comprising: 

capturing a first image containing the text in the first language; 
receiving instructions to edit the first image; and 

editing the first image to generate the second image in response to the instructions. 

3. The method of claim 1, further comprising displaying the image.

4. The method of claim 1, further comprising displaying the image and displaying the
translation of the text in the second language simultaneously.

5. The method of claim 1, further comprising establishing a wireless connection with the 
network. 

6. The method of claim 1, wherein the image is a first image containing first text, the 
method further comprising: 

transmitting a second image containing second text in the first language over the 
network; and 

receiving a translation of the first text and the second text in the second language over 
the network. 

7. The method of claim 6, further comprising transmitting the first image and the second 
image over a network in response to a single command from a user. 
8. The method of claim 6, further comprising displaying one of the translation of the 
first text and the translation of the second text in response to a command from a user. 

9. The method of claim 1, further comprising compressing the image. 

10. The method of claim 1, further comprising receiving the image from an image capture 
device. 

11. A method comprising: 

receiving an image containing text in a first language over a network; 

translating the text to a second language; and 
transmitting the translation over the network. 

12. The method of claim 11, further comprising extracting the text from the image with 
optical character recognition. 

13. The method of claim 11, further comprising receiving a specification of the first 
language. 

14. A device comprising: 

an image capture apparatus that receives an image containing text in a first language; 
a transmitter that transmits the image over a network; and 
a receiver that receives a translation of the text in a second language over the 
network. 

15. The device of claim 14, further comprising a display that displays the translation. 

16. The device of claim 14, further comprising a display that displays the translation and 
the image simultaneously. 

17. The device of claim 14, further comprising a controller that edits the image in 
response to the commands of a user. 

18. The device of claim 14, further comprising an image capture device that supplies the 
image to the image capture apparatus. 

19. The device of claim 18, wherein the image capture device is a digital camera. 

20. The device of claim 14, further comprising a cellular telephone that establishes a 
communication link between the device and the network. 

21. A device comprising: 

a receiver that receives an image containing text in a first language over a network; 
a translator that generates a translation of the text in a second language; and 
a transmitter that transmits the translation over the network. 

22. The device of claim 21, further comprising a controller that selects the translator as a 
function of the first language. 

23. The device of claim 21, further comprising an optical character recognition module 
that extracts the text from the image. 

24. A system comprising: 

a client device having an image capture apparatus that receives an image containing 
text in a first language, a client transmitter that transmits the image over a network to the 
server, a client receiver that receives a translation of the text in a second language over the 
network from the server; and 

a server having a receiver that receives the image over the network from the client, a 
translator that generates a translation of the text in the second language, and a transmitter that 
transmits the translation over the network to the client. 
25. The system of claim 24, the server further comprising an optical character recognition 
module that extracts the text from the image. 



NETWORK-BASED TRANSLATION SYSTEM 
ABSTRACT 

The invention provides techniques for translation of written languages using a 
network. A user captures an image of the text of interest with a client device and transmits the image over 
the network to a server. The server recovers the text from the image, generates a translation, 
and transmits the translation over the network to the client device. The client device may 
also support techniques for editing the image to retain the text of interest and excise 
extraneous matter from the image. 
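
By way of illustration only, the exchange summarized above may be sketched in Python as follows. The sketch is not the claimed implementation and rests on several assumptions that do not appear in the description: the names Image, edit_image, recover_text, translate, handle_request, and request_translation are hypothetical; an image is modeled as rows of text in place of pixels; a small word list stands in for a translation engine; and a direct function call stands in for the network.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Image:
    """Toy image: each row holds the text an OCR engine would find there."""
    rows: List[str]


def edit_image(image: Image, first_row: int, last_row: int) -> Image:
    """Client side: retain the rows of interest and excise the rest."""
    return Image(rows=image.rows[first_row:last_row + 1])


# Hypothetical word list standing in for a real translation engine.
WORD_LISTS: Dict[Tuple[str, str], Dict[str, str]] = {
    ("fr", "en"): {"sortie": "exit", "gare": "station"},
}


def recover_text(image: Image) -> str:
    """Server side: placeholder for optical character recognition."""
    return " ".join(row.strip() for row in image.rows if row.strip())


def translate(text: str, source: str, target: str) -> str:
    """Server side: word-for-word lookup; unknown words pass through."""
    words = WORD_LISTS.get((source, target), {})
    return " ".join(words.get(w.lower(), w) for w in text.split())


def handle_request(image: Image, source: str, target: str) -> str:
    """Server side: receive the image, recover the text, translate it."""
    return translate(recover_text(image), source, target)


def request_translation(image: Image, source: str, target: str) -> str:
    """Client side: 'transmit' the image and 'receive' the translation.

    A direct call stands in for the round trip over the network.
    """
    return handle_request(image, source, target)


if __name__ == "__main__":
    captured = Image(rows=["(poster in background)", "SORTIE", ""])
    edited = edit_image(captured, 1, 1)  # keep only the sign text
    print(request_translation(edited, "fr", "en"))
```

In this sketch the client holds no language-specific data; the word lists and the recognition step reside entirely on the server side.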